NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Gauguin, Descartes, Bayes: A Diurnal Golem’s Brain

https://doi.org/10.1145/3759429.3762631

Chandra, Kartik; Liu, Amanda; Ragan-Kelley, Jonathan; Tenenbaum, Joshua B (October 2025, ACM)

A "quine" is a deterministic program that prints itself. In this essay, I will show you a "gauguine": a probabilistic program that infers itself. A gauguine is repeatedly asked to guess its own source code. Initially, its chances of guessing correctly are of course minuscule. But as the gauguine observes more and more of its own previous guesses, it detects patterns of behavior and gains information about its inner workings. This information allows it to bootstrap self-knowledge, and ultimately discover its own source code. We will discuss how—and why—we might write a gauguine, and what we stand to learn by constructing one.
more » « less
Free, publicly-accessible full text available October 9, 2026
Lightweight and Locality-Aware Composition of Black-Box Subroutines

https://doi.org/10.1145/3729292

Bansal, Manya; Sharlet, Dillon; Ragan-Kelley, Jonathan; Amarasinghe, Saman (June 2025, Proceedings of the ACM on Programming Languages)

Subroutines are essential building blocks in software design: users encapsulate common functionality in libraries and write applications by composing calls to subroutines. Unfortunately, performance may be lost at subroutine boundaries due to reduced locality and increased memory consumption. Operator fusion helps recover the performance lost at composition boundaries. Previous solutions fuse operators by manually rewriting code into monolithic fused subroutines, or by relying on heavy-weight compilers to generate code that performs fusion. Both approaches require a semantic understanding of the entire computation, breaking the decoupling necessary for modularity and reusability of subroutines. In this work, we attempt to identify the minimal ingredients required to fuse computations, enabling composition of subroutines without sacrificing performance or modularity. We find that, unlike previous approaches that require a semantic understanding of the computation, most opportunities for fusion require understanding only data production and consumption patterns.Exploiting this insight, we add fusion on top of black-box subroutines by proposing a lightweight enrichment of subroutine declarations to expose data-dependence patterns. We implement our approach in a system called Fern, and demonstrate Fern’s benefits by showing that it is competitive with state-of-the-art, high-performance libraries with manually fused operators, can fuse across library and domain boundaries for unforeseen workloads, and can deliver speedups of up to 5× over unfused code.
more » « less
Free, publicly-accessible full text available June 10, 2026
Exo 2: Growing a Scheduling Language

https://doi.org/10.1145/3669940.3707218

Ikarashi, Yuka; Qian, Kevin; Droubi, Samir; Reinking, Alex; Bernstein, Gilbert Louis; Ragan-Kelley, Jonathan (March 2025, ACM)

Free, publicly-accessible full text available March 30, 2026
A Verified Compiler for a Functional Tensor Language

https://doi.org/10.1145/3656390

Liu, Amanda; Bernstein, Gilbert; Chlipala, Adam; Ragan-Kelley, Jonathan (June 2024, Proceedings of the ACM on Programming Languages)

Producing efficient array code is crucial in high-performance domains like image processing and machine learning. It requires the ability to control factors like compute intensity and locality by reordering computations into different stages and granularities with respect to where they are stored. However, traditional pure, functional tensor languages struggle to do so. In a previous publication, we introduced ATL as a pure, functional tensor language capable of systematically decoupling compute and storage order via a set of high-level combinators known as reshape operators. Reshape operators are a unique functional-programming construct since they manipulate storage location in the generated code by modifying the indices that appear on the left-hand sides of storage expressions. We present a formal correctness proof for an implementation of the compilation algorithm, marking the first verification of a lowering algorithm targeting imperative loop nests from a source functional language that enables separate control of compute and storage ordering. One of the core difficulties of this proof required properly formulating the complex invariants to ensure that these storage-index remappings were well-formed. Notably, this exercise revealed asoundness bugin the original published compilation algorithm regarding the truncation reshape operators. Our fix is a new type system that captures safety conditions that were previously implicit and enables us to prove compiler correctness for well-typed source programs. We evaluate this type system and compiler implementation on a range of common programs and optimizations, including but not limited to those previously studied to demonstrate performance comparable to established compilers like Halide.
more » « less
Heat Treatments for Minimization of Residual Stresses and Maximization of Tensile Strengths of Scalmalloy® Processed via Directed Energy Deposition

https://doi.org/10.3390/ma17061333

Boillat-Newport, Rachel; Isanaka, Sriram Praneeth; Kelley, Jonathan; Liou, Frank (March 2024, Materials)

Scalmalloy® is an Al-Mg-Sc-Zr-based alloy specifically developed for additive manufacturing (AM). This alloy is designed for use with a direct aging treatment, as recommended by the manufacturer, rather than with a multistep treatment, as often seen in conventional manufacturing. Most work with Scalmalloy® is conducted using powder bed rather than powder-fed processes. This investigation seeks to fill this knowledge gap and expand beyond single-step aging to promote an overall balanced AM-fabricated component. For this study, directed energy deposition (DED)-fabricated Scalmalloy® components were subjected to low-temperature treatments to minimize residual stresses inherent in the material due to the layer-by-layer build process. X-ray diffraction (XRD) indicated the possibility of stress minimization while reducing the detriment to mechanical strength through lower temperature treatments. Microstructural analyses consisting of energy dispersion spectroscopy (EDS) and electron backscatter diffraction (EBSD) revealed the presence of grain growth detrimentally affecting the strength and elongation made possible by very small grains inherent to AM and rapid solidification. Tensile testing determined that treatment at 175 °C for 1 h provides the best relief from the existing residual stresses; however, this is accompanied by a diminishment in the yield and tensile strength of 19 and 9.5%, respectively. It is noted that treatment at 175 °C for 2 h did not provide as great of a decrease in residual stresses, theorized to be the result of grain growth and other strengthening mechanisms further stressing the structure; however, the residual stresses are still significantly diminished compared with the as-built condition. Furthermore, a minimal reduction of the tensile strengths indicates the possibility of finding a balance between property diminishment and stress state through the work proposed here.
more » « less
Full Text Available
Distributions for Compositionally Differentiating Parametric Discontinuities

https://doi.org/10.1145/3649843

Michel, Jesse; Mu, Kevin; Yang, Xuanda; Bangaru, Sai Praveen; Collins, Elias Rojas; Bernstein, Gilbert; Ragan-Kelley, Jonathan; Carbin, Michael; Li, Tzu-Mao (April 2024, Proceedings of the ACM on Programming Languages)

Computations in physical simulation, computer graphics, and probabilistic inference often require the differentiation of discontinuous processes due to contact, occlusion, and changes at a point in time. Popular differentiable programming languages, such as PyTorch and JAX, ignore discontinuities during differentiation. This is incorrect forparametric discontinuities—conditionals containing at least one real-valued parameter and at least one variable of integration. We introduce Potto, the first differentiable first-order programming language to soundly differentiate parametric discontinuities. We present a denotational semantics for programs and program derivatives and show the two accord. We describe the implementation of Potto, which enables separate compilation of programs. Our prototype implementation overcomes previous compile-time bottlenecks achieving an 88.1x and 441.2x speed up in compile time and a 2.5x and 7.9x speed up in runtime, respectively, on two increasingly large image stylization benchmarks. We showcase Potto by implementing a prototype differentiable renderer with separately compiled shaders.
more » « less
Full Text Available
Inferring the Future by Imagining the Past

Chandra, Kartik; Chen, Tony; Li, Tzu-Mao; Ragan-Kelley, Jonathan; Tenenbaum, Joshua (December 2023, Proceedings of 2023 Conference on Neural Information Processing Systems)

A single panel of a comic book can say a lot: it can depict not only where the characters currently are, but also their motions, their motivations, their emotions, and what they might do next. More generally, humans routinely infer complex sequences of past and future events from a static snapshot of a dynamic scene, even in situations they have never seen before. In this paper, we model how humans make such rapid and flexible inferences. Building on a long line of work in cognitive science, we offer a Monte Carlo algorithm whose inferences correlate well with human intuitions in a wide variety of domains, while only using a small, cognitively-plausible number of samples. Our key technical insight is a surprising connection between our inference problem and Monte Carlo path tracing, which allows us to apply decades of ideas from the computer graphics community to this seemingly-unrelated theory of mind task.
more » « less
Full Text Available
SLANG.D: Fast, Modular and Differentiable Shader Programming

Bangaru, Sai; Wu, Lifan; Munkberg, Jacob; Bernstein, Gilbert; Ragan-Kelley, Jonathan; Durand, Fredo; Lefohn, Aaron; He, Yong (December 2023, Transactions on Graphics)

Full Text Available
How to Guess a Gradient

Singhal, Utkarsh; Cheung, Brian; Chandra, Kartik; Ragan-Kelley, Jonathan; Tenenbaum, Joshua B; Poggio, Tomaso; Yu, Stella X (December 2023, arXivorg)

How much can you say about the gradient of a neural network without computing a loss or knowing the label? This may sound like a strange question: surely the answer is “very little.” However, in this paper, we show that gradients are more structured than previously thought. Gradients lie in a predictable low-dimensional subspace which depends on the network architecture and incoming features. Exploiting this structure can significantly improve gradient-free optimization schemes based on directional derivatives, which have struggled to scale beyond small networks trained on toy datasets. We study how to narrow the gap in optimization performance between methods that calculate exact gradients and those that use directional derivatives. Furthermore, we highlight new challenges in overcoming the large gap between optimizing with exact gradients and guessing the gradients.
more » « less
Full Text Available
Acting as Inverse Inverse Planning

https://doi.org/10.1145/3588432.3591510

Chandra, Kartik; Li, Tzu-Mao; Tenenbaum, Joshua; Ragan-Kelley, Jonathan (July 2023, ACM)

Great storytellers know how to take us on a journey. They direct characters to act—not necessarily in the most rational way—but rather in a way that leads to interesting situations, and ultimately creates an impactful experience for audience members looking on. If audience experience is what matters most, then can we help artists and animators directly craft such experiences, independent of the concrete character actions needed to evoke those experiences? In this paper, we offer a novel computational framework for such tools. Our key idea is to optimize animations with respect to simulated audience members’ experiences. To simulate the audience, we borrow an established principle from cognitive science: that human social intuition can be modeled as “inverse planning,” the task of inferring an agent’s (hidden) goals from its (observed) actions. Building on this model, we treat storytelling as “inverse inverse planning,” the task of choosing actions to manipulate an inverse planner’s inferences. Our framework is grounded in literary theory, naturally capturing many storytelling elements from first principles. We give a series of examples to demonstrate this, with supporting evidence from human subject studies.
more » « less

« Prev Next »

Search for: All records